
    Robust estimation of microbial diversity in theory and in practice

    Quantifying diversity is of central importance for the study of the structure, function and evolution of microbial communities. The estimation of microbial diversity has received renewed attention with the advent of large-scale metagenomic studies. Here, we consider what the diversity observed in a sample tells us about the diversity of the community being sampled. First, we argue that one cannot reliably estimate the absolute and relative number of microbial species present in a community without making unsupported assumptions about species abundance distributions. The reason for this is that sample data do not contain information about the number of rare species in the tail of species abundance distributions. We illustrate the difficulty in comparing species richness estimates by applying Chao's estimator of species richness to a set of in silico communities: they are ranked incorrectly in the presence of large numbers of rare species. Next, we extend our analysis to a general family of diversity metrics ("Hill diversities"), and construct lower and upper estimates of diversity values consistent with the sample data. The theory generalizes Chao's estimator, which we retrieve as the lower estimate of species richness. We show that Shannon and Simpson diversity can be robustly estimated for the in silico communities. We analyze nine metagenomic data sets from a wide range of environments, and show that our findings are relevant for empirically sampled communities. Hence, we recommend the use of Shannon and Simpson diversity rather than species richness in efforts to quantify and compare microbial diversity. Comment: To be published in The ISME Journal. Main text: 16 pages, 5 figures. Supplement: 16 pages, 4 figures.
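As a rough illustration of the quantities named in this abstract, the sketch below computes plug-in Hill diversities and a Chao-type richness estimate from a vector of per-species counts. It assumes the common Chao1 form (observed richness plus f1^2 / (2 f2), with the usual fallback when there are no doubletons); the abstract does not say which variant the authors use. The function names `hill_diversity` and `chao1` are hypothetical, and the lower and upper estimates constructed in the paper are not reproduced here.

```python
import math


def hill_diversity(counts, q):
    """Plug-in Hill diversity of order q from per-species counts.

    q = 0 gives observed richness, q = 1 the exponential of Shannon
    entropy, q = 2 the inverse Simpson index.
    """
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if q == 1:
        # Limit of the general formula as q -> 1.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


def chao1(counts):
    """Chao1 richness estimate (assumed variant, see lead-in)."""
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    if f2 > 0:
        return observed + f1 * f1 / (2.0 * f2)
    # Bias-corrected fallback when no doubletons are observed.
    return observed + f1 * (f1 - 1) / 2.0
```

For a perfectly even sample such as `[10, 10, 10, 10]`, every Hill order returns 4. The Chao-type estimate depends only on the singleton and doubleton counts, which is one way to see why it is sensitive to the rare tail.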

    Robust Estimators are Hard to Compute

    In modern statistics, the robust estimation of the parameters of a regression hyperplane is a central problem. Robustness means that the estimate is unaffected, or only slightly affected, by outliers in the data. In this paper, it is shown that the following robust estimators are hard to compute: LMS, LQS, LTS, LTA, MCD, MVE, Constrained M estimator, Projection Depth (PD) and Stahel-Donoho. In addition, a data set is presented such that the ltsReg procedure of R has probability less than 0.0001 of finding a correct answer. Furthermore, it is described how to design new robust estimators. Keywords: computational statistics, complexity theory, robust statistics, algorithms, search heuristics.
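To give a feel for why an estimator such as LTS (least trimmed squares) is expensive to compute exactly, here is a minimal brute-force sketch for a straight-line fit. It relies on the fact that minimizing the sum of the h smallest squared residuals is equivalent to taking the best ordinary least-squares fit over all subsets of size h. This exhaustive search is only an illustration, not the method analyzed in the paper and not what R's ltsReg does; the names `ols_line` and `exact_lts_line` are hypothetical.

```python
from itertools import combinations


def ols_line(points):
    """Ordinary least-squares line through (x, y) points.

    Returns (intercept, slope, residual_sum_of_squares).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # If all x are equal, any slope fits equally well; use 0.
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in points)
    return intercept, slope, rss


def exact_lts_line(points, h):
    """Exact LTS line fit by enumerating every subset of size h.

    The loop visits C(n, h) subsets, so the cost grows
    combinatorially with the number of points.
    """
    best = None
    for subset in combinations(points, h):
        candidate = ols_line(subset)
        if best is None or candidate[2] < best[2]:
            best = candidate
    return best
```

With six points on the line y = 2x + 1 and two gross outliers, `exact_lts_line(points, 6)` recovers the line after checking 28 subsets. The subset count is what makes this approach impractical beyond small samples.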

    Data and uncertainty in extreme risks - a nonlinear expectations approach

    Estimation of tail quantities, such as expected shortfall or Value at Risk, is a difficult problem. We show how the theory of nonlinear expectations, in particular the Data-robust expectation introduced in [5], can assist in the quantification of statistical uncertainty for these problems. However, in a heavy-tailed context (in particular when the data are described by a Pareto distribution, as is common in much of extreme value theory), the theory of [5] is insufficient and requires an additional regularization step, which we introduce. By asking whether this regularization is possible, we obtain a qualitative requirement for reliable estimation of tail quantities and risk measures in a Pareto setting.
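For readers unfamiliar with the two tail quantities mentioned here, the following sketch computes plain empirical estimates of Value at Risk and expected shortfall from a sample of losses. It is not the data-robust expectation approach of the paper. The convention used (VaR as the smallest order statistic whose empirical CDF reaches alpha, expected shortfall as the mean of losses at or above it) is one of several in use and is an assumption, as is the function name `empirical_var_es`.

```python
import math


def empirical_var_es(losses, alpha):
    """Empirical Value at Risk and expected shortfall at level alpha.

    Convention (an assumption): VaR is the smallest order statistic
    whose empirical CDF is at least alpha; expected shortfall is the
    mean of all losses at or above that value.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(losses)
    n = len(ordered)
    k = max(math.ceil(alpha * n) - 1, 0)  # zero-based index of VaR
    var = ordered[k]
    tail = ordered[k:]
    es = sum(tail) / len(tail)
    return var, es
```

At high levels alpha the tail average rests on only a handful of observations, which hints at why such estimates are fragile for heavy-tailed data.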

    Robust quantum parameter estimation: coherent magnetometry with feedback

    We describe the formalism for optimally estimating and controlling both the state of a spin ensemble and a scalar magnetic field with information obtained from a continuous quantum-limited measurement of the spin precession due to the field. The full quantum parameter estimation model is reduced to a simplified equivalent representation to which classical estimation and control theory is applied. We consider the tracking of both static and fluctuating fields in the transient and steady-state regimes. By using feedback control, the field estimation can be made robust to uncertainty about the total spin number.

    A Simple Approach to the Parametric Estimation of Potentially Nonstationary Diffusions

    A simple and robust approach is proposed for the parametric estimation of scalar homogeneous stochastic differential equations. We specify a parametric class of diffusions and estimate the parameters of interest by minimizing criteria based on the integrated squared difference between kernel estimates of the drift and diffusion functions and their parametric counterparts. The procedure does not require simulations or approximations to the true transition density and has the simplicity of standard nonlinear least-squares methods in discrete time. A complete asymptotic theory for the parametric estimates is developed. The limit theory relies on infill and long span asymptotics and is robust to deviations from stationarity, requiring only recurrence. Keywords: Diffusion, Drift, Local time, Parametric estimation, Semimartingale, Stochastic differential equation.

    Worst-case estimation and asymptotic theory for models with unobservables

    This paper proposes a worst-case approach for estimating econometric models containing unobservable variables. Worst-case estimators are robust against the adverse effects of unobservables. In contrast to the classical literature, worst-case estimation makes no assumptions about the statistical nature of the unobservables. The method is robust with respect to the unknown probability distribution of the unobservables and should be seen as a complement to standard methods, as cautious modelers should compare different estimations to determine robust models. The limit theory is obtained. A Monte Carlo study of finite sample properties has been conducted. An economic application is included.